Oct 23, 2018

Chapter 2 Section 1 Objectives

Frequency and Distributions for Organizing and Summarizing Data


  • Develop an ability to summarize data in the format of a frequency distribution and a relative frequency distribution.
  • For a frequency distribution, identify values of class width, class midpoint, class limits, and class boundaries.

Definitions

  • Frequency distribution (frequency table) - Shows how data are partitioned among several categories (classes) by listing the categories along with the number of data values (the frequency) in each of them.
  • Lower class limits - Smallest numbers that can belong to each of the different classes.
  • Upper class limits - Largest numbers that can belong to each of the different classes.
  • Class boundaries - Numbers used to separate classes, but without the gaps created by class limits.
  • Class midpoints - Values in the middle of the classes. Each midpoint is the average of the lower and upper limits of its class.
  • Class width - Difference between two consecutive lower class limits.

Constructing a Frequency Table

  1. Select a number of classes. Generally between 5 and 20.
  2. Calculate class width. \[ \textrm{Class width} \approx \frac{\textrm{Max data value} - \textrm{Min data value}}{\textrm{Number of classes}} \] Round up!
  3. Choose the lower class starting value by using the minimum value or a convenient value below the minimum.
  4. Then use the first lower class limit and the class width to create the other lower class limits.
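The steps above can be sketched in Python. This is a minimal illustration and not part of the course's Excel workflow: the function name `class_limits`, its parameters, and the sample values (minimum 10, maximum 58, 5 classes) are all hypothetical, and it assumes whole-number data so that each upper limit is one less than the next lower limit.

```python
import math

def class_limits(min_value, max_value, num_classes, start=None):
    """Return (lower, upper) class limits for whole-number data."""
    # Step 2: class width is the range divided by the number of classes, rounded up.
    width = math.ceil((max_value - min_value) / num_classes)
    # Step 3: start at the minimum unless a convenient lower value is supplied.
    lower = min_value if start is None else start
    # Step 4: each lower limit is the previous one plus the class width;
    # each upper limit is one less than the next lower limit.
    return [(lower + i * width, lower + (i + 1) * width - 1)
            for i in range(num_classes)]

# Hypothetical data summary: minimum 10, maximum 58, 5 classes.
print(class_limits(10, 58, 5))
# [(10, 19), (20, 29), (30, 39), (40, 49), (50, 59)]
```

If the range divides evenly by the number of classes, the maximum can land just past the last class, so in that case a slightly larger width or a lower starting value may be needed.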

Frequency Table Setup Example

Let's assume we have a set of data with a minimum value of 51 and a maximum value of 223, and that we want 6 classes.

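  1. We select 6 classes.
  2. \(\frac{223 - 51}{6} = 28\frac{2}{3}\), so rounding up gives a class width of 29.
  3. Let 51 be the first lower class limit.
  4. Adding 29 repeatedly gives the remaining lower class limits: 80, 109, 138, 167, and 196.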

Frequency Table

Lower class limit Upper class limit
51 79
80 108
109 137
138 166
167 195
196 224
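As a rough check on the definitions, the class boundaries, class midpoints, and class width for this table can be computed directly from the class limits. The following is a minimal Python sketch; the variable names are illustrative only.

```python
# Class limits taken from the frequency table setup example.
limits = [(51, 79), (80, 108), (109, 137), (138, 166), (167, 195), (196, 224)]

# Gap between one class's upper limit and the next class's lower limit (1 here).
gap = limits[1][0] - limits[0][1]

# Boundaries sit halfway across each gap, so neighbouring classes share one.
boundaries = [lo - gap / 2 for lo, _ in limits] + [limits[-1][1] + gap / 2]

# Midpoint: average of a class's lower and upper limits.
midpoints = [(lo + hi) / 2 for lo, hi in limits]

# Class width: difference between two consecutive lower class limits.
width = limits[1][0] - limits[0][0]

print(boundaries)  # [50.5, 79.5, 108.5, 137.5, 166.5, 195.5, 224.5]
print(midpoints)   # [65.0, 94.0, 123.0, 152.0, 181.0, 210.0]
print(width)       # 29
```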

Let's do it in Excel

Open the Chapter 2 Excel document from Moodle.

On sheet 1 there are some random values from 51 to 223.

Click Data \(\rightarrow\) Select Data Analysis \(\rightarrow\) Select histogram \(\rightarrow\) Click OK


Relative Frequency Distribution

Open the Chapter 2 Excel document from Moodle.

On sheet 1 there are some random values from 51 to 223.

Click Data \(\rightarrow\) Select Data Analysis \(\rightarrow\) Select histogram \(\rightarrow\) Click OK

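Outside Excel, the same idea can be sketched in a few lines of Python: a relative frequency is a class frequency divided by the total number of data values. The data values below are hypothetical (the actual spreadsheet values are not reproduced here); the class limits are the ones from the frequency table setup example.

```python
# Hypothetical data values between 51 and 223 (not the values in the spreadsheet).
data = [52, 60, 75, 81, 88, 95, 104, 110, 115, 120,
        128, 131, 140, 150, 161, 170, 180, 190, 200, 223]

limits = [(51, 79), (80, 108), (109, 137), (138, 166), (167, 195), (196, 224)]

# Frequency: how many data values fall inside each class.
counts = [sum(lo <= x <= hi for x in data) for lo, hi in limits]

# Relative frequency: class frequency divided by the total number of values.
total = sum(counts)
relative = [c / total for c in counts]

for (lo, hi), c, r in zip(limits, counts, relative):
    print(f"{lo}-{hi}: frequency {c}, relative frequency {r:.2f}")
```

The relative frequencies should add up to 1 (or 100% when written as percentages), apart from rounding.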

Normal Distribution

Frequencies start out low, increase to a maximum in the middle, then decrease again toward low frequencies. This distribution is approximately symmetric.

Chapter 2 Section 2 Objectives

Frequency and Distributions for Organizing and Summarizing Data


  • Develop the ability to picture the distribution of data in the format of a histogram or relative frequency histogram.
  • Examine a histogram and identify common distributions, including a uniform distribution and a normal distribution.

Definitions

  • Histogram - A graph consisting of bars of equal width drawn adjacent to each other. The horizontal scale represents classes of quantitative data values and the vertical scale represents frequencies.
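To make the definition concrete, here is a minimal text-only sketch in Python. It is an illustration rather than the course's Excel or XLSTAT procedure: the function name `text_histogram` and the frequencies are hypothetical, and the picture is rotated so that classes run down the page and bar length stands for frequency.

```python
def text_histogram(limits, frequencies):
    """Return one line per class: a class label followed by a bar of '#'."""
    lines = []
    for (lo, hi), freq in zip(limits, frequencies):
        label = f"{lo}-{hi}".ljust(8)    # pad labels so the bars line up
        lines.append(label + "#" * freq)  # bar length equals the frequency
    return lines

limits = [(51, 79), (80, 108), (109, 137), (138, 166), (167, 195), (196, 224)]
frequencies = [3, 4, 5, 3, 3, 2]  # hypothetical class frequencies

for line in text_histogram(limits, frequencies):
    print(line)
```

Using relative frequencies instead of frequencies for the bar lengths would give the shape of a relative frequency histogram.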

Types of distributions

Create a Histogram in Excel

Install XLSTAT

Click XLSTAT Tab \(\rightarrow\) Select Visualizing Data \(\rightarrow\) Select Histogram


Chapter 2 Section 3 Objectives

  • Develop the ability to graph data using a dotplot, stemplot, time-series graph, Pareto chart, pie chart, and frequency chart.
  • Determine when a graph is deceptive through the use of a nonzero axis or a pictograph that uses an object of area or volume for one-dimensional data.

Dotplots

Stemplots

## 
##   The decimal point is 2 digit(s) to the right of the |
## 
##   0 | 5677799
##   1 | 0011111122
##   1 | 55888888
##   2 | 123
##   2 | 556
##   3 | 4
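A simpler stemplot can be sketched in Python for two-digit data, using the tens digit as the stem and the units digit as the leaf. The function name `stemplot` and the data are hypothetical, and unlike the output above this sketch does not split each stem into two rows or rescale the values.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot lines: tens digit as the stem, units digit as the leaf."""
    stems = defaultdict(list)
    for v in sorted(values):          # sorting keeps the leaves in order
        stems[v // 10].append(v % 10)
    # Include every stem between the smallest and largest, even if it is empty.
    return [f"{stem} | {''.join(str(leaf) for leaf in stems[stem])}"
            for stem in range(min(stems), max(stems) + 1)]

# Hypothetical data values.
for line in stemplot([23, 12, 40, 15, 21, 38, 23, 41]):
    print(line)
# 1 | 25
# 2 | 133
# 3 | 8
# 4 | 01
```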

Time Series Plot

Pareto Graph

##   count        defect cum
## 4    94  contact num.  94
## 1    80    price code 174
## 3    66 supplier code 240
## 5    33     part num. 273
## 2    27 schedule date 300
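The table above lists the categories in descending order of count, with a running (cumulative) total, which is the ordering a Pareto chart uses for its bars. A minimal Python sketch of that ordering, using the counts from the table (the variable names are illustrative):

```python
# Defect counts from the table above, in their original (unsorted) order.
defects = {
    "price code": 80,
    "schedule date": 27,
    "supplier code": 66,
    "contact num.": 94,
    "part num.": 33,
}

# Pareto ordering: categories sorted from highest to lowest count.
ordered = sorted(defects.items(), key=lambda item: item[1], reverse=True)

# Running total of the counts in that order.
rows = []
running = 0
for defect, count in ordered:
    running += count
    rows.append((defect, count, running))

for defect, count, cum in rows:
    print(f"{count:>5} {defect:>14} {cum:>4}")
```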

Pie Chart

Chapter 3 Section 1 Objectives

  • Develop the ability to measure the center of data by finding the mean, median, mode, and midrange.
  • Determine whether an outlier has a substantial effect on the mean and median.

Definitions

  • Measure of Center - Value at the center or middle of the data set.
  • Mean - Measure of center found by adding all the data values together and dividing by the total number of data values.
  • Resistant - A statistic is resistant if the presence of extreme values does not cause it to change much.
  • Median - Measure of center that is the middle value when the original data values are arranged in order of magnitude.
  • Mode - The value in the data set that has the greatest frequency.
  • Midrange - Measure of center that is the value midway between the maximum and minimum values in the data set.

Properties of the Mean

  • Sample means drawn from the same population tend to vary less than other measures of center.
  • The mean of a data set uses every data value.
  • The mean is not resistant to outliers.

\[ \textrm{Mean} = \frac{1}{n}\Sigma x \]

\(n\) is the number of values in the data set.

\(x\) is used to represent each individual value in a data set.

\(\Sigma\) denotes the sum of all data values.
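The formula translates directly into code. A minimal Python sketch with hypothetical data values, which also shows how a single extreme value pulls the mean:

```python
values = [2, 4, 6, 8, 10]  # hypothetical data set

n = len(values)            # n: number of values in the data set
mean = sum(values) / n     # (1/n) times the sum of all data values
print(mean)                # 6.0

# Adding one extreme value changes the mean substantially (to about 21.67),
# which is what "not resistant" means.
with_outlier = values + [100]
mean_with_outlier = sum(with_outlier) / len(with_outlier)
print(round(mean_with_outlier, 2))
```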

Properties of Median

  • Median does not change by large amounts when there are a few extreme values. The median is resistant.
  • Median does not directly use every data value.

  • Odd number of data values - The median is the middle value.
  • Even number of data values - The median is the mean of the 2 middle values.
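The odd and even cases can be sketched as a small Python function. The function name `median` and the data are hypothetical, and Python's standard library already provides `statistics.median` for real use.

```python
def median(values):
    """Middle value of the sorted data, or the mean of the 2 middle values."""
    ordered = sorted(values)   # arrange the values in order of magnitude
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:             # odd count: single middle value
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: mean of 2 middle values

print(median([7, 1, 5]))            # 5
print(median([7, 1, 5, 3]))         # 4.0
print(median([1, 3, 5, 7, 1000]))   # 5, the extreme value 1000 has little effect
```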

Properties of Mode

  • The mode can be found with qualitative data.
  • A data set can have no mode, one mode, or multiple modes.
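Both properties can be illustrated with a short Python sketch. The function name `modes` and the data are hypothetical, and the sketch assumes the common convention that a data set has no mode when no value repeats.

```python
from collections import Counter

def modes(values):
    """Return a list of the most frequent values (empty if no value repeats)."""
    if not values:
        return []
    counts = Counter(values)
    highest = max(counts.values())
    if highest == 1:           # assumed convention: no repeats means no mode
        return []
    return [value for value, count in counts.items() if count == highest]

print(modes(["red", "blue", "red", "green"]))  # ['red'], works for qualitative data
print(modes([1, 1, 2, 2, 3]))                  # [1, 2], two modes
print(modes([1, 2, 3]))                        # [], no mode
```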

Properties of Midrange

  • Because the midrange uses only the minimum and maximum data values, it is very sensitive to extreme values. The midrange is not resistant.
  • The midrange is rarely used.
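A minimal Python sketch of the midrange, with hypothetical data, shows why it is so sensitive to extreme values:

```python
def midrange(values):
    """Value midway between the minimum and maximum data values."""
    return (min(values) + max(values)) / 2

values = [2, 4, 6, 8, 10]          # hypothetical data set
print(midrange(values))            # 6.0

# One extreme value moves the midrange a long way.
print(midrange(values + [100]))    # 51.0
```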